DeepSeek-R1 Local Deployment


This guide describes how to deploy the DeepSeek-R1 large language model locally on NVIDIA Jetson Orin devices using Ollama, a lightweight inference engine, enabling offline AI interaction with a simple and efficient installation process.


1. Overview

Large language models (LLMs) like DeepSeek-R1 are gradually becoming a core component of edge intelligence applications. Running them directly on Jetson Orin offers key benefits:

  • Fully offline operation
  • Low-latency response
  • Enhanced data privacy

This guide covers:

  • Environment preparation
  • Installing Ollama
  • Running DeepSeek-R1
  • (Optional) Using Open WebUI for a web-based interface

2. Environment Preparation

Hardware

Component   Requirement
Device      Jetson Orin (Nano / NX)
Memory      ≥ 8 GB (larger models require more)
Storage     ≥ 10 GB (varies by model size)
GPU         NVIDIA GPU with CUDA support

Software

  • Ubuntu 20.04 / 22.04 (JetPack 5.1.1+ recommended)
  • NVIDIA CUDA Toolkit and drivers (included with JetPack)
  • Docker (optional, for containerized deployment)

⚙️ Tip: Use jetson_clocks and check nvpmodel to enable maximum performance mode for the best inference results.
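
For example, the following commands switch to maximum performance mode and lock the clocks. Note that the nvpmodel mode ID for MAXN varies by device, so query the available modes first:

sudo nvpmodel -q      # query the current power mode
sudo nvpmodel -m 0    # mode 0 is MAXN on many Orin devices (verify for yours)
sudo jetson_clocks    # lock CPU/GPU/EMC clocks to their maximums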


3. Ollama Installation (Inference Engine)

Option A: Native Script Installation

Open a terminal and run the following command to install Ollama:

curl -fsSL https://ollama.com/install.sh | sh
  • Installs the Ollama service and CLI tools.
  • Automatically handles dependencies and configures the background service.
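
Once the script finishes, a quick sanity check confirms the CLI is on your $PATH (the version string will vary with the release you installed):

ollama --version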

Option B: Docker Deployment

sudo docker run --runtime=nvidia --rm --network=host \
    -v ~/ollama:/ollama \
    -e OLLAMA_MODELS=/ollama \
    dustynv/ollama:r36.4.0

🧩 The Docker version is maintained by the NVIDIA community (dustynv) and optimized for Jetson.
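
To confirm the container is up, list the running containers and look for the dustynv/ollama image:

sudo docker ps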

Verify Ollama is Running

ss -tuln | grep 11434

Expected output:

LISTEN 0 128 127.0.0.1:11434 ...

If port 11434 is listening, the Ollama service has started successfully.
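
You can also query Ollama's HTTP API directly; the version endpoint returns a small JSON document (the version number on your system will differ):

curl http://127.0.0.1:11434/api/version
# e.g. {"version":"0.5.7"}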


4. Running DeepSeek-R1

Starting the Model

To run the 1.5B parameter version:

ollama run deepseek-r1:1.5b
  • Ollama automatically downloads the model if it is not cached locally.
  • An interactive chat session then starts in the terminal.

💡 Depending on your hardware capability, you can replace 1.5b with 8b, 14b, etc.
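
Beyond the interactive mode, you can pass a prompt directly on the command line or call the REST API that Ollama serves on port 11434 (the prompt text here is only an illustration):

ollama run deepseek-r1:1.5b "Explain edge computing in one sentence."

curl http://127.0.0.1:11434/api/generate -d '{
  "model": "deepseek-r1:1.5b",
  "prompt": "Explain edge computing in one sentence.",
  "stream": false
}'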

Model Version Comparison

Version   Memory Requirement   Notes
1.5B      ~6–8 GB              Suitable for Orin Nano/NX
8B+       ≥ 16 GB              Requires AGX Orin
70B       —                    🚫 Not supported on Jetson
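
Model files are large, so keep an eye on what is cached. Ollama's built-in commands list downloaded models and delete ones you no longer need:

ollama list                   # show downloaded models and their sizes
ollama rm deepseek-r1:1.5b    # remove a model to free storage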

5. Web Interface (Open WebUI)

Open WebUI provides a user-friendly, browser-based chat interface.

Install Open WebUI (using Docker)

sudo docker run -d --network=host \
    -v ${HOME}/open-webui:/app/backend/data \
    -e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
    --name open-webui \
    --restart always \
    ghcr.io/open-webui/open-webui:main
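
If the interface does not come up, the container logs are the first place to look:

sudo docker logs -f open-webui    # press Ctrl+C to stop following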

Access the WebUI

Open the following URL in a browser on the device (with --network=host, Open WebUI listens on its default port, 8080):

http://localhost:8080/
  • Interact with the DeepSeek-R1 model through a graphical chat interface.
  • View conversation history and review model responses directly in the browser.

6. Performance Optimization

Optimization Area     Description
Memory Usage          Use a smaller model (e.g., 1.5B) or enable swap (see the sketch below)
Jetson Performance    Enable MAXN and run jetson_clocks
Model Caching         Ensure sufficient space in the ~/ollama directory
Runtime Monitoring    Use htop or tegrastats to monitor system load
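
A minimal sketch for adding an 8 GB swap file (the size is an assumption; adjust it to your storage, and remember that swap is far slower than RAM, so it mainly helps models that almost fit in memory):

sudo fallocate -l 8G /swapfile    # size chosen for illustration
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab    # persist across reboots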

📉 The initial model load may take about 30 seconds to 1 minute; subsequent runs will be faster thanks to caching.


7. Troubleshooting

Issue                      Solution
Port 11434 not listening   Restart Ollama or check the Docker container status
Model fails to load        Insufficient memory; try a smaller version (e.g., 1.5B)
Cannot access Web UI       Verify Docker is running and connected to the host network
Ollama command not found   Re-run the installation script or add Ollama to your $PATH
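
With the native script installation, Ollama runs as a systemd service, so restarting it and inspecting its logs looks like this (the service name assumes the standard install):

sudo systemctl restart ollama    # restart the service
sudo systemctl status ollama     # confirm it is active
journalctl -u ollama -f          # follow the service logs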

8. Appendix

Example Directory

~/ollama/         # Model cache directory
~/open-webui/     # WebUI persistent data
